Search | WHO COVID-19 Research Database

1.

Comparing full variation profile analysis with the conventional consensus method in SARS-CoV-2 phylogeny (preprint)

Regina Nora Fiam; Istvan Csabai; Norbert Solymosi.

biorxiv; 2023.

Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.08.03.551784

ABSTRACT

This study proposes a novel approach to studying SARS-CoV-2 virus mutations through sequencing data comparison. Traditional consensus-based methods, which focus on the most common nucleotide at each position, might overlook or obscure the presence of low-frequency variants. Our method, in contrast, retains all sequenced nucleotides at each position, forming a genomic matrix. Using simulated short reads from genomes with specified mutations, we compared our genomic matrix approach with the consensus sequence method. Our matrix methodology accurately reflected the known mutations and true compositions, demonstrating its efficacy in capturing sample variability and the relationships between samples. Further tests using real data from GISAID and NCBI-SRA confirmed its reliability and robustness. The genomic matrix approach offers a more accurate representation of viral genomic diversity, thereby providing better insight into virus evolution and epidemiology. Recommendations for future applications are provided based on our observed results.
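
The contrast between a consensus sequence and a per-position variation profile can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(start, sequence)` read format, and the assumption that reads are already aligned without indels are all hypothetical simplifications chosen for illustration.

```python
from collections import Counter

BASES = "ACGT"  # column order of the matrix (assumed for this sketch)


def profile_matrix(reads, genome_length):
    """Build a genome_length x 4 matrix of nucleotide frequencies.

    reads: iterable of (start, sequence) pairs with 0-based start positions.
    Assumes reads are already aligned to the reference and contain no indels.
    """
    counts = [Counter() for _ in range(genome_length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < genome_length and base in BASES:
                counts[pos][base] += 1

    matrix = []
    for c in counts:
        depth = sum(c.values())
        # Uncovered positions get an all-zero row.
        matrix.append([c[b] / depth if depth else 0.0 for b in BASES])
    return matrix


def consensus(matrix):
    """Collapse the matrix to the most frequent base per position ('N' if uncovered)."""
    out = []
    for row in matrix:
        if not any(row):
            out.append("N")
        else:
            best = max(range(len(BASES)), key=lambda i: row[i])
            out.append(BASES[best])
    return "".join(out)


# Toy data: one of four reads carries a T instead of C at position 1.
reads = [(0, "ACGT"), (0, "ACGT"), (0, "ATGT"), (1, "CGT")]
matrix = profile_matrix(reads, 4)
print(matrix[1])          # frequencies of A, C, G, T at position 1
print(consensus(matrix))  # the minority T is no longer visible here
```

In this toy example the matrix row for position 1 keeps the 25% minority T, while the consensus string reports only the majority C.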

2.

Systematic detection of co-infection and intra-host recombination in more than 2 million global SARS-CoV-2 samples (preprint)

Anna Medgyes-Horváth; Orsolya Pipek; József Stéger; Krisztián Papp; Dávid Visontai; Marion Koopmans; David Nieuwenhuijse; Bas Oude Munnink; István Csabai; VEO Technical Working Group.

researchsquare; 2023.

Preprint in English | PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-3159433.v1

ABSTRACT

Systematic monitoring of SARS-CoV-2 co-infections between different lineages and assessing the risk of intra-host recombinant emergence are crucial for forecasting viral evolution. Here we present a comprehensive analysis of more than 2 million SARS-CoV-2 raw read datasets submitted to the European COVID-19 Data Portal to identify co-infections and intra-host recombination. Co-infection was observed in 0.35% of the investigated cases. Two independent procedures were implemented to detect intra-host recombination. We show that sensitivity is predominantly determined by the density of lineage-defining mutations along the genome, thus we used an expanded list of mutually exclusive defining mutations of specific variant combinations to increase statistical power. We call attention to multiple challenges rendering recombinant detection difficult and provide guidelines for the reduction of false positives arising from chimeric sequences produced during PCR amplification. Additionally, we identify three recombination hotspots of Delta – Omicron BA.1 intra-host recombinants.

Subject(s)

COVID-19 , Coinfection

3.

Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses (preprint)

Nadim Rahman; Ahmad Zyoud; Alexey Sokolov; Bas Oude Munnink; Björn Andreas Grüning; Carla Cummins; Clara Amid; David F Nieuwenhuijse; Dávid Visontai; David Yu Yuan; Dipayan Gupta; Divyae Prasad; Gábor Máté Gulyás; Gabriele Rinck; Jasmine McKinnon; Jeff Knaggs; Jeffrey Edward Skiby; József Stéger; Khadim Gueye; Krisztián Papp; Maarten Hoek; Manish Kumar; Marianna A. Ventouratou; Marie-Catherine Bouquieaux; Martin Koliba; Milena Mansurova; Muhammad Haseeb; Nathalie Worp; Peter W Harrison; Rasko Leinonen; Ross Thorne; Sandeep Selvakumar; Sarah Hunt; Sundar Venkataraman; Suran Jayathilaka; Timothée Cezard; Wolfgang Maier; Zahra Waheed; Zamin Iqbal; Frank M. Aarestrup; Istvan Csabai; Marion Koopmans; Tony Burdett; Guy Cochrane.

biorxiv; 2023.

Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.04.19.537514

ABSTRACT

The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, which have become part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets at scale via the SARS-CoV-2 Data Hubs, and (4) lessons learned. As a component of the Platform, the SARS-CoV-2 Data Hubs enabled the extension and set-up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.

Subject(s)

COVID-19

4.

Identification of mutations in SARS-CoV-2 PCR primer regions (preprint)

Anikó Mentes; Krisztián Papp; Dávid Visontai; József Stéger; VEO Technical Working Group; István Csabai; Anna Medgyes-Horváth; Orsolya Anna Pipek.

researchsquare; 2022.

Preprint in English | PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-1838361.v1

ABSTRACT

Due to the constantly increasing number of mutations in the SARS-CoV-2 genome, concerns have emerged over the possibility of decreased diagnostic accuracy of reverse transcription-polymerase chain reaction (RT-PCR), the gold standard diagnostic test for SARS-CoV-2. We propose an analysis pipeline to discover genomic variations overlapping the target regions of commonly used PCR primer sets. We provide the list of these mutations in a publicly available format based on a dataset of more than 600,000 SARS-CoV-2 samples. Our approach distinguishes between mutations that may have a damaging impact on PCR efficiency and those expected to be neutral in this respect. Samples are categorized as “prone to misclassification” vs. “likely to be correctly detected” by a given PCR primer set based on the estimated effect of the mutations present. Samples susceptible to misclassification are always present at a daily rate of 2% or lower, while the daily ratio of samples having a slight chance of misclassification with a particular primer set can reach 90%. As different variant strains may temporarily gain dominance in the worldwide SARS-CoV-2 viral population, the efficiency of a particular PCR primer set may change over time; constant monitoring of variations in primer target regions is therefore highly recommended.
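
The idea of flagging mutations that overlap primer target regions can be sketched as a simple interval check. This is a minimal illustration, not the pipeline described in the abstract: the primer name and coordinates are made up, and the rule that a mutation within the last five bases of the primer's 3' end counts as potentially damaging is an assumption introduced here only to show how a two-way classification could work.

```python
def overlapping_mutations(mutation_positions, primers):
    """Return, per primer, the mutation positions that fall inside its target region.

    mutation_positions: iterable of 1-based genome positions.
    primers: dict mapping a primer name to an inclusive (start, end) interval.
    """
    hits = {name: [] for name in primers}
    for pos in mutation_positions:
        for name, (start, end) in primers.items():
            if start <= pos <= end:
                hits[name].append(pos)
    return hits


def classify_sample(mutation_positions, primer, three_prime_window=5):
    """Label a sample for one forward-strand primer.

    ASSUMPTION (not from the source): a mutation within the last
    `three_prime_window` bases of the primer's 3' end is treated as
    potentially damaging; any other mutation is treated as neutral.
    """
    start, end = primer
    critical_start = max(start, end - three_prime_window + 1)
    for pos in mutation_positions:
        if critical_start <= pos <= end:
            return "prone to misclassification"
    return "likely to be correctly detected"


# Hypothetical primer coordinates, for illustration only.
primers = {"primer_A": (100, 120)}
hits = overlapping_mutations([50, 105, 118], primers)
print(hits)
print(classify_sample([118], primers["primer_A"]))
print(classify_sample([105], primers["primer_A"]))
```

A real analysis would also need to handle reverse-strand primers, probes, and indels, which this sketch leaves out.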

5.

Host genomes for the unique SARS-CoV-2 variant leaked into Antarctic soil metagenomic sequencing data (preprint)

Istvan Csabai; Norbert Solymosi.

researchsquare; 2022.

Preprint in English | PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-1330800.v1

ABSTRACT

Recently, Csabai et al. [1] found a most likely contaminated metagenomic sample set from Antarctica that contained traces of unique SARS-CoV-2 variants. This is a short follow-up to that report, in which we attempt to find the genetic footprint of the hosts. With reasonable confidence we could identify genetic material from mitochondria of Homo sapiens, green monkey and Chinese hamster. The latter two most probably originated from the cell lines Vero E6 (or COS-7) and CHO, respectively, which are frequently used in laboratories to culture viruses, including SARS-CoV-2 and its closest relatives.

6.

Unique SARS-CoV-2 variant found in public sequence data of Antarctic soil samples collected in 2018-2019 (preprint)

István Csabai; Krisztián Papp; Dávid Visontai; József Stéger; Norbert Solymosi.

researchsquare; 2021.

Preprint in English | PREPRINT-RESEARCHSQUARE | ID: ppzbmed-10.21203.rs.3.rs-1177047.v1

ABSTRACT

The COVID-19 pandemic has been going on for two years now and, although many hypotheses have been put forward, its origin remains obscure. We investigated whether samples in the large public sequencing data archives that were collected before the earliest known cases of the pandemic might contain traces of SARS-CoV-2. Here we report the bioinformatic analysis of a metagenome sample set collected from soil on King George Island, Antarctica between 2018-12-24 and 2019-01-13. It contains sequence fragments matching the SARS-CoV-2 reference genome, totalling more than half a million nucleotides and covering the complete genome at an average depth of 17×. Preliminary phylogenetic analysis places the sample close to the earliest known cases. The high sequence coverage rules out chance alignments from other species, but possible laboratory contamination cannot be excluded. The sequence harbours a unique combination of mutations, unseen in other samples, so whatever its origin, it can add an important piece of information to the puzzle of the ongoing pandemic.

Subject(s)

COVID-19
